Abstract

The behavior of the Poisson-Dirichlet distribution with small mutation rate is studied through 
large deviations. The structure of the rate function indicates that the number of alleles is finite 
at the instant when mutation appears. The large deviation results are then used to study the 
asymptotic behavior of the homozygosity, and the Poisson-Dirichlet distribution with symmetric
selection. The latter shows that several alleles can coexist when selection intensity goes to
infinity in a particular way as the mutation rate approaches zero. 
1 Introduction

For $\theta > 0$, let $V_1(\theta) \ge V_2(\theta) \ge \cdots$ be the points of a nonhomogeneous Poisson process with mean
measure density

\[ \theta v^{-1} e^{-v}, \quad v > 0. \]

Set

\[ V(\theta) = \sum_{i=1}^{\infty} V_i(\theta), \]

and

\[ P(\theta) = (P_1(\theta), P_2(\theta), \ldots) = \left( \frac{V_1(\theta)}{V(\theta)}, \frac{V_2(\theta)}{V(\theta)}, \ldots \right). \tag{1.1} \]

Then $P(\theta)$ and $V(\theta)$ are independent, and $V(\theta)$ is a Gamma($\theta$, 1)-distributed random variable.
The law of $P(\theta)$ is called the Poisson-Dirichlet distribution with parameter $\theta$, and is denoted by
$PD(\theta)$.

The labeled version of the Poisson-Dirichlet distribution, called the Dirichlet process, is defined
as the law of

\[ \Xi_{\theta,\nu} = \sum_{k=1}^{\infty} P_k(\theta)\, \delta_{X_k}, \tag{1.2} \]

where $X_k$, $k = 1, 2, \ldots$, is a sequence of i.i.d. random variables, independent of $P(\theta)$, with a common
diffusive distribution $\nu$ on $[0,1]$, i.e., $\nu(\{x\}) = 0$ for every $x$ in $[0,1]$. The Dirichlet process was
introduced in [8] as a prior distribution in the context of Bayesian statistics.

The Poisson-Dirichlet distribution was introduced by Kingman [11] to describe the distribution
of gene frequencies in a large neutral population at a particular locus, with each component $P_k(\theta)$
representing the proportion of the $k$th most frequent allele. It is the unique reversible measure (cf.
[6]) of the infinitely many neutral alleles diffusion process with state space

\[ \nabla = \left\{ p = (p_1, p_2, \ldots) : p_1 \ge p_2 \ge \cdots \ge 0, \ \sum_{i=1}^{\infty} p_i \le 1 \right\}, \]

and generator

\[ \frac{1}{2} \sum_{i,j=1}^{\infty} p_i(\delta_{ij} - p_j) \frac{\partial^2}{\partial p_i \partial p_j} - \frac{\theta}{2} \sum_{i=1}^{\infty} p_i \frac{\partial}{\partial p_i} \]

defined on an appropriate domain.

The parameter $\theta$ represents the scaled mutation rate of a population in the context of population
genetics. In Bayesian statistics, it can be interpreted as the prior sample size. When $\theta$ is large,
the proportions of different alleles under $PD(\theta)$ are evenly spread and approach zero, while for
small values of $\theta$, $PD(\theta)$ concentrates mostly around the point $(1, 0, \ldots)$. There are extensive
studies of the asymptotic behavior of $PD(\theta)$ when $\theta$ goes to infinity ([15], [9], [10], [2], [7]). Since
the proportions of alleles are evenly spread and uniformly small, it is natural to see Gaussian
structures ([10], [7]) for large $\theta$. For small mutation rates, the study is very limited. The author is
aware only of results in [13] for Dirichlet($\theta$, $\nu$), and in [3] and [5] for the infinitely many neutral
alleles diffusion model.

The case of $\theta = 1$ is special. It appears as an asymptotic distribution in random number theory
([1]). It is also a critical value in the boundary behavior of the infinitely many neutral alleles model.
By using techniques from the theory of Dirichlet forms, it was shown in [12] that for the infinitely
many neutral alleles model, with probability one, there will exist times at which the sample path
will hit the boundary of a finite-dimensional sub-simplex of $\nabla$ or, equivalently, the single point
$(1, 0, \ldots)$ if and only if $\theta$ is less than one. The intuition here is that it is possible to have a finite number of
alleles in the population if the mutation rate is small.
But in equilibrium, with $PD(\theta)$ probability one, the number of alleles is always infinite as long as
$\theta$ is strictly positive. In other words, the critical value between a finite number of alleles and an infinite
number of alleles is zero for $PD(\theta)$. In physical terms this sudden change from one to infinity
can be viewed as a phase transition. The objective of this paper is to investigate the microscopic
structures during this phase transition. The tool we use is from the theory of large deviations.
Intuitively, it is unlikely to get a large number of alleles when the mutation first appears. Our
result confirms this intuition rigorously, and reveals a transition structure that can be viewed
as a "ladder of energy".

The paper is organized as follows. In Section 2, we establish the large deviation principle for
$PD(\theta)$ when $\theta$ goes to zero. The rate function is identified explicitly. In Section 3, the large
deviation result is applied to study the asymptotic behavior of the homozygosity and the impact of
selection or exponential tilting. It will be shown that, in contrast to the neutral case, the population
under overdominant selection can preserve more than one allele when the mutation rate goes to
zero and the selection intensity goes to infinity in a particular way.

2 Large Deviations 

In this section, we establish the large deviation principle for $PD(\theta)$ when $\theta$ goes to zero. The result
will be obtained through a series of lemmas, and the main techniques in the proof are exponential
approximation and the contraction principle ([3]).

Let $U = U(\theta)$ be a Beta(1, $\theta$) random variable, $E = [0, 1]$, and $\lambda(\theta) = (-\log\theta)^{-1}$.

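Lemma 2.1 The family of the laws of $U(\theta)$ satisfies a large deviation principle on $E$ with speed
$\lambda(\theta)$ and rate function

\[ I(p) = \begin{cases} 0, & p = 1 \\ 1, & \text{else.} \end{cases} \]

Proof For any $a < b$ in $E$, let $J$ denote one of the intervals $(a, b)$, $[a, b)$, $(a, b]$, and $[a, b]$. Since
$P\{U > x\} = (1-x)^\theta$, it follows from direct calculation that for $b < 1$

\[ \lim_{\theta \to 0} \lambda(\theta) \log P\{U \in J\} = \lim_{\theta \to 0} \lambda(\theta) \log\left[ (1-a)^\theta (1 - c^\theta) \right] = -1, \tag{2.1} \]

where $c = (1-b)/(1-a)$; here we use $1 - c^\theta \sim \theta \log(1/c)$ and $\lambda(\theta)\log\theta = -1$. If $b = 1$, then
$\lim_{\theta \to 0} \lambda(\theta) \log P\{U \in J\} = 0$. These, combined with the compactness
of $E$, imply the result.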



□ 
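The limit in Lemma 2.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `scaled_log_prob` and the test interval are illustrative assumptions, and the only fact used is that a Beta(1, $\theta$) variable has tail $P\{U > x\} = (1-x)^\theta$.

```python
import math

def scaled_log_prob(theta: float, a: float, b: float) -> float:
    """Return lambda(theta) * log P{a < U < b} for U ~ Beta(1, theta)."""
    # P{a < U < b} = (1-a)**theta - (1-b)**theta
    #              = (1-a)**theta * (1 - c**theta),  c = (1-b)/(1-a).
    log_tail_a = theta * math.log1p(-a)
    if b >= 1.0:
        log_prob = log_tail_a                      # interval reaches the point 1
    else:
        log_c = math.log1p(-b) - math.log1p(-a)    # log c < 0
        # log(1 - c**theta), computed stably for tiny theta
        log_prob = log_tail_a + math.log(-math.expm1(theta * log_c))
    speed = 1.0 / (-math.log(theta))               # lambda(theta)
    return speed * log_prob

for theta in (1e-2, 1e-8, 1e-32, 1e-128):
    print(theta, scaled_log_prob(theta, 0.2, 0.7), scaled_log_prob(theta, 0.2, 1.0))
```

As $\theta$ decreases, the first printed column of values should drift toward -1 (intervals that stay away from 1) and the second toward 0 (intervals ending at 1). Convergence is only logarithmic in $\theta$, which is why very small values of $\theta$ are used.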



Next let $U_1, U_2, \ldots$ be i.i.d. copies of $U$ and

\[ X_1 = U_1, \quad X_m = (1-U_1) \cdots (1-U_{m-1})\, U_m, \quad m \ge 2. \tag{2.2} \]
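The products in (2.2) are a stick-breaking construction and are easy to simulate. The sketch below is illustrative only: the function name `stick_breaking`, the truncation level, and the seed are assumptions, and `random.betavariate` from the Python standard library is used to draw the Beta(1, $\theta$) variables.

```python
import random

def stick_breaking(theta: float, m: int, rng: random.Random) -> list:
    """Return X_1, ..., X_m from (2.2), with U_i i.i.d. Beta(1, theta)."""
    xs, remaining = [], 1.0
    for _ in range(m):
        u = rng.betavariate(1.0, theta)
        xs.append(remaining * u)     # X_k = (1-U_1)...(1-U_{k-1}) U_k
        remaining *= 1.0 - u         # mass left after k breaks
    return xs

rng = random.Random(0)
for theta in (1.0, 0.1, 0.01):
    xs = stick_breaking(theta, 50, rng)
    print(theta, max(xs), sum(xs))
```

For small $\theta$ the largest piece is typically close to one, in line with the concentration of $PD(\theta)$ near $(1, 0, \ldots)$ described in the introduction.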
Lemma 2.2 For any $n \ge 1$, the family of the laws of $P_{1,n}(\theta) = \max\{X_1, \ldots, X_n\}$ satisfies a large
deviation principle on $E$ with speed $\lambda(\theta)$ and rate function

\[ I_n(p) = \begin{cases} 0, & p = 1 \\ k, & p \in [\frac{1}{k+1}, \frac{1}{k}), \ k = 1, 2, \ldots, n-1 \\ n, & \text{else.} \end{cases} \tag{2.3} \]

Proof Noting that $P_{1,n}(\theta)$ is a continuous function of $(U_1, \ldots, U_n)$, it follows from Lemma 2.1,
the independence, and the contraction principle that the family of the laws of $P_{1,n}(\theta)$ satisfies a
large deviation principle on $E$ with speed $\lambda(\theta)$ and rate function

\[ I'(p) = \inf\left\{ \sum_{i=1}^{n} I(u_i) : u_i \in E, \ 1 \le i \le n; \ \max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_{n-1})u_n\} = p \right\}. \]

For $p = 1$, one has $I'(1) = 0$ by choosing $u_i = 1$ for $i = 1, \ldots, n$. If $p$ is in $[1/2, 1)$, then at least
one of the $u_i$ is not one. By choosing $u_1 = p$, $u_i = 1$, $i = 2, \ldots, n$, it follows that $I'(p) = 1$ for $p$ in
$[1/2, 1)$.

For each $m \ge 2$, we have

\[ \max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_m)\} = \max\{u_1, (1-u_1)\max\{u_2, \ldots, (1-u_2)\cdots(1-u_m)\}\}. \tag{2.4} \]

Noting that

\[ \max\{u_1, 1-u_1\} \ge \frac{1}{2}, \quad u_1 \in E, \]

it follows from (2.4) and induction that

\[ \max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_m)\} \ge \frac{1}{m+1}, \quad u_i \in E, \ i = 1, \ldots, m. \tag{2.5} \]

Thus, for $2 \le k \le n-1$ and $p$ in $[\frac{1}{k+1}, \frac{1}{k})$, in order for the equality

\[ \max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_{n-1})u_n\} = p \]

to hold, it is necessary that $u_1, u_2, \ldots, u_k$ are all less than one. In other words, $I'(p) \ge k$. Since the
function $\max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_k)\}$ is a surjection from $E^k$ onto $[\frac{1}{k+1}, 1]$, there
exist $u_1 < 1, \ldots, u_k < 1$ such that

\[ \max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_k)\} = p. \]

By choosing $u_j = 1$ for $j = k+1, \ldots, n$, it follows that $I'(p) = k$.
Finally, for $p$ in $[0, \frac{1}{n})$, in order for

\[ \max\{u_1, (1-u_1)u_2, \ldots, (1-u_1)\cdots(1-u_{n-1})u_n\} = p \]

to have solutions, each $u_i$ has to be less than one and, thus, $I'(p) = n$. Therefore, $I'(p) = I_n(p)$ for
all $p$ in $E$.

□ 
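Inequality (2.5) rests on the observation that the $m+1$ terms inside the maximum are nonnegative and sum to one. A small numerical sketch of this, with the helper name `max_stick`, the choice $m = 4$, and the random seed all being illustrative assumptions:

```python
import random

def max_stick(us):
    """max{u_1, (1-u_1)u_2, ..., (1-u_1)...(1-u_{m-1})u_m, (1-u_1)...(1-u_m)}."""
    best, remaining = 0.0, 1.0
    for u in us:
        best = max(best, remaining * u)
        remaining *= 1.0 - u
    return max(best, remaining)      # the last term is the leftover product

m = 4
rng = random.Random(1)
# Smallest value of the maximum seen over random points of E^m.
worst = min(max_stick([rng.random() for _ in range(m)]) for _ in range(100_000))
# u_i = 1/(m+2-i), i = 1..m, makes all m+1 terms equal to 1/(m+1).
equalizer = [1.0 / (m + 1 - i) for i in range(m)]
print(worst, max_stick(equalizer), 1.0 / (m + 1))
```

The random search never drops below $1/(m+1)$, and the equalizing choice attains it, which is the lower endpoint of the range $[\frac{1}{k+1}, 1]$ used in the surjectivity step of the proof.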

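Lemma 2.3 The laws of $P_1(\theta)$ under $PD(\theta)$ satisfy a large deviation principle on $E$ with speed
$\lambda(\theta)$ and rate function

\[ S_1(p) = \begin{cases} 0, & p = 1 \\ k, & p \in [\frac{1}{k+1}, \frac{1}{k}), \ k = 1, 2, \ldots \\ \infty, & p = 0. \end{cases} \tag{2.6} \]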

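Proof Since the law of $(X_1, X_2, \ldots)$ is the same as the size-biased permutation of $PD(\theta)$, it
follows that $P_1(\theta)$ under $PD(\theta)$ has the same distribution as $\tilde{P}_1(\theta) = \max\{X_i : i \ge 1\}$. For any
$\delta > 0$, it follows from direct calculation that for any $n \ge 1$

\[ P\{\tilde{P}_1(\theta) - P_{1,n}(\theta) > \delta\} \le P\{(1-U_1)\cdots(1-U_n) > \delta\}. \]

Since $E[1 - U_i] = \theta/(1+\theta)$, Markov's inequality bounds the right-hand side by $\delta^{-1}(\theta/(1+\theta))^n$,
which implies that

\[ \limsup_{\theta \to 0} \lambda(\theta) \log P\{\tilde{P}_1(\theta) - P_{1,n}(\theta) > \delta\} \le -n. \tag{2.7} \]

Hence $\{P_{1,n}(\theta) : \theta > 0\}$ are exponentially good approximations of $\{\tilde{P}_1(\theta) : \theta > 0\}$. By direct
calculation, for every closed subset $F$ of $E$

\[ \inf_{q \in F} S_1(q) = \limsup_{n \to \infty} \inf_{q \in F} I_n(q). \]

This, combined with Theorem 4.2.16 in [3] and the fact that $S_1(p)$ is a good rate function, implies
that a large deviation principle holds for the laws of $\tilde{P}_1(\theta)$ with speed $\lambda(\theta)$ and rate function

\[ \sup_{\delta > 0} \liminf_{n \to \infty} \inf_{|q-p| < \delta} I_n(q), \]

which is clearly equal to $S_1(p).$

□

For any $m \ge 1$, let

\[ \nabla_m = \left\{ (p_1, \ldots, p_m) : 0 \le p_m \le \cdots \le p_1, \ \sum_{k=1}^{m} p_k \le 1 \right\}, \tag{2.8} \]

and set $Q_{m,\theta}$ to be the law of $(P_1(\theta), \ldots, P_m(\theta))$ under $PD(\theta)$ on the space $\nabla_m$.
For any $\delta > 0$, and any $(p_1, \ldots, p_m) \in \nabla_m$, let

\[ G((p_1, \ldots, p_m); \delta) = \{(q_1, \ldots, q_m) \in \nabla_m : |q_k - p_k| < \delta, \ k = 1, \ldots, m\}, \]
\[ F((p_1, \ldots, p_m); \delta) = \{(q_1, \ldots, q_m) \in \nabla_m : |q_k - p_k| \le \delta, \ k = 1, \ldots, m\}. \]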
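Lemma 2.4 For fixed $m \ge 2$, the family $\{Q_{m,\theta} : \theta > 0\}$ satisfies a large deviation principle on the
space $\nabla_m$ with speed $\lambda(\theta)$ and rate function

\[ S_m(p_1, \ldots, p_m) = \begin{cases} 0, & (p_1, p_2, \ldots, p_m) = (1, 0, \ldots, 0) \\ l - 1, & 2 \le l \le m, \ \sum_{k=1}^{l} p_k = 1, \ p_l > 0 \\ m + S_1\left( \frac{p_m}{1 - \sum_{k=1}^{m} p_k} \wedge 1 \right), & \sum_{k=1}^{m} p_k < 1, \ p_m > 0 \\ \infty, & \text{else.} \end{cases} \tag{2.9} \]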
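Proof Let $m \ge 2$ be fixed, and let $g_1^\theta$ denote the density function of $P_1(\theta)$. Then for any $p \in (0, 1)$

\[ g_1^\theta(p)\, p\, (1-p)^{1-\theta} = \theta \int_0^{(p/(1-p)) \wedge 1} g_1^\theta(x)\, dx. \tag{2.10} \]

The joint density function $g_m^\theta$ of $(P_1(\theta), \ldots, P_m(\theta))$ is given by (cf. [14])

\[ g_m^\theta(p_1, \ldots, p_m) = \frac{\theta^{m-1} \left(1 - \sum_{k=1}^{m-1} p_k\right)^{\theta - 2}}{p_1 \cdots p_{m-1}}\, g_1^\theta\left( \frac{p_m}{1 - \sum_{k=1}^{m-1} p_k} \right) \]

for

\[ (p_1, \ldots, p_m) \in \nabla_m^\circ = \left\{ (p_1, \ldots, p_m) \in \nabla_m : 0 < p_m < \cdots < p_1 < 1, \ \sum_{k=1}^{m} p_k < 1 \right\}, \]

and is zero otherwise. Thus for any fixed $(p_1, \ldots, p_m) \in \nabla_m^\circ$ we have

\[ g_m^\theta(p_1, \ldots, p_m) = \frac{\theta^{m} \left(1 - \sum_{k=1}^{m} p_k\right)^{\theta - 1}}{p_1 \cdots p_m} \int_0^{(p_m/(1 - \sum_{k=1}^{m} p_k)) \wedge 1} g_1^\theta(x)\, dx. \tag{2.11} \]

The key step in the proof is to show that for every $(p_1, \ldots, p_m)$ in $\nabla_m$

\[ \lim_{\delta \to 0} \liminf_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}(F((p_1, \ldots, p_m); \delta)) = \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}(G((p_1, \ldots, p_m); \delta)) = -S_m(p_1, \ldots, p_m). \tag{2.12} \]

For any $(p_1, \ldots, p_m)$ in $\nabla_m$ satisfying $\sum_{i=1}^{m} p_i > 0$, define

\[ r = r(p_1, \ldots, p_m) = \max\{i : 1 \le i \le m, \ p_i > 0\}. \tag{2.13} \]

We divide the proof into several disjoint cases.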
Case I: $r = 1$, i.e., $(p_1, \ldots, p_m) = (1, 0, \ldots, 0)$.
For any $\delta > 0$,

\[ F((1, 0, \ldots, 0); \delta) \subset \{(q_1, \ldots, q_m) \in \nabla_m : |q_1 - 1| \le \delta\}, \]

and one can choose $\delta' < \delta$ such that

\[ \{(q_1, \ldots, q_m) \in \nabla_m : |q_1 - 1| < \delta'\} \subset G((1, 0, \ldots, 0); \delta). \]

These, combined with Lemma 2.3, imply (2.12) in this case.
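Case II: $r = m$, $\sum_{k=1}^{m} p_k < 1$.

Choose $\delta > 0$ so that

\[ \delta < \min\left\{ p_m, \ \frac{1 - \sum_{i=1}^{m} p_i}{m} \right\}. \]

By (2.11), we have that for any $(q_1, \ldots, q_m)$ in $F((p_1, \ldots, p_m); \delta) \cap \nabla_m^\circ$

\[ g_m^\theta(q_1, \ldots, q_m) \le \frac{\theta^{m} \left(1 - \sum_{i=1}^{m} (p_i + \delta)\right)^{\theta - 1}}{(p_1 - \delta) \cdots (p_m - \delta)} \int_0^{\frac{p_m + \delta}{1 - \sum_{i=1}^{m} (p_i + \delta)} \wedge 1} g_1^\theta(x)\, dx, \]

which, combined with Lemma 2.3, implies

\[ \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{F((p_1, \ldots, p_m); \delta)\} \le -m + \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log P\left\{ P_1(\theta) \le \frac{p_m + \delta}{1 - \sum_{i=1}^{m} (p_i + \delta)} \wedge 1 \right\} \le -\left[ m + S_1\left( \frac{p_m}{1 - \sum_{i=1}^{m} p_i} \wedge 1 \right) \right], \tag{2.14} \]

where the right continuity of $S_1(\cdot)$ is used in the last inequality.
On the other hand, let

\[ \tilde{G}((p_1, \ldots, p_m); \delta) = \prod_{i=1}^{m} \left( p_i + \frac{\delta}{2}, \ p_i + \delta \right) \cap \nabla_m^\circ, \]

which is clearly a subset of $G((p_1, \ldots, p_m); \delta)$. Using (2.11) again, it follows that for any $(q_1, \ldots, q_m)$
in $\tilde{G}((p_1, \ldots, p_m); \delta)$

\[ g_m^\theta(q_1, \ldots, q_m) \ge \frac{\theta^{m} \left(1 - \sum_{i=1}^{m} (p_i + \delta/2)\right)^{\theta - 1}}{(p_1 + \delta) \cdots (p_m + \delta)} \int_0^{\frac{p_m + \delta/2}{1 - \sum_{i=1}^{m} (p_i + \delta/2)} \wedge 1} g_1^\theta(x)\, dx, \]

which, combined with Lemma 2.3, implies

\[ \liminf_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{G((p_1, \ldots, p_m); \delta)\} \ge \liminf_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{\tilde{G}((p_1, \ldots, p_m); \delta)\} \ge -m - S_1\left( \frac{p_m + \delta/2}{1 - \sum_{i=1}^{m} (p_i + \delta/2)} \wedge 1 \right). \]

It follows, by letting $\delta$ go to zero, that

\[ \lim_{\delta \to 0} \liminf_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{G((p_1, \ldots, p_m); \delta)\} \ge -S_m(p_1, \ldots, p_m). \tag{2.15} \]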



Case III: $2 \le r \le m-1$, $\sum_{i=1}^{r} p_i < 1$ or $p_1 = 0$.

This case follows from estimate (2.14) and the fact that $S_1(0) = \infty$.
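Case IV: $r = m$, $\sum_{k=1}^{m} p_k = 1$.

Note that for any $\delta > 0$

\[ F((p_1, \ldots, p_m); \delta) \cap \nabla_m^\circ \subset \{(q_1, \ldots, q_m) \in \nabla_m^\circ : |q_i - p_i| \le \delta, \ i = 1, \ldots, m-1\}. \]

By applying Case II to $(P_1(\theta), \ldots, P_{m-1}(\theta))$ at the point $(p_1, \ldots, p_{m-1})$, we get

\[ \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{F((p_1, \ldots, p_m); \delta)\} \le -[m - 1 + S_1(1)] = -(m-1). \tag{2.16} \]

On the other hand, one can choose $\delta > 0$ small so that $\frac{q_m}{1 - \sum_{i=1}^{m} q_i} > 1$ for any $(q_1, \ldots, q_m)$ in
$G((p_1, \ldots, p_m); \delta) \cap \nabla_m^\circ$.
Set

\[ \tilde{G} = \left\{ (q_1, \ldots, q_m) \in \nabla_m^\circ : p_i < q_i < p_i + \frac{\delta}{m-1}, \ i = 1, \ldots, m-1; \ p_m - \delta < q_m < p_m \right\}. \]

Clearly $\tilde{G}$ is a subset of $G((p_1, \ldots, p_m); \delta)$. It follows from (2.11) that for any $(q_1, \ldots, q_m)$ in $\tilde{G}$,

\[ g_m^\theta(q_1, \ldots, q_m) \ge \frac{\theta^{m} \left(1 - \sum_{i=1}^{m} q_i\right)^{\theta - 1}}{(p_1 + \delta/(m-1)) \cdots (p_{m-1} + \delta/(m-1))\, p_m}. \]

For $m \ge 2$, let

\[ A_m = \left\{ (q_1, \ldots, q_{m-1}) \in \nabla_{m-1} : p_i < q_i < p_i + \frac{\delta}{m-1}, \ i = 1, \ldots, m-1, \ \sum_{i=1}^{m-1} q_i < 1 \right\}. \]

Then

\[ \int_{\tilde{G}} \theta \left(1 - \sum_{i=1}^{m} q_i\right)^{\theta - 1} dq_1 \cdots dq_m = \int_{A_m} dq_1 \cdots dq_{m-1} \int_{p_m - \delta}^{p_m \wedge (1 - \sum_{i=1}^{m-1} q_i)} \theta \left(1 - \sum_{i=1}^{m} q_i\right)^{\theta - 1} dq_m = \int_{A_m} \left(1 + \delta - p_m - \sum_{i=1}^{m-1} q_i\right)^{\theta} dq_1 \cdots dq_{m-1}, \]

which converges to a strictly positive number depending only on $\delta$ and $(p_1, \ldots, p_m)$ as $\theta$ goes to zero.
Hence

\[ \lim_{\delta \to 0} \liminf_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{G((p_1, \ldots, p_m); \delta)\} \ge \lim_{\delta \to 0} \liminf_{\theta \to 0} \lambda(\theta) \log Q_{m,\theta}\{\tilde{G}\} \ge -(m-1). \tag{2.17} \]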



Case V: $2 \le r \le m-1$, $\sum_{i=1}^{r} p_i = 1$.

First note that for any $\delta > 0$, $F((p_1, \ldots, p_m); \delta)$ is a subset of

\[ \{(q_1, \ldots, q_m) \in \nabla_m : |q_i - p_i| \le \delta, \ i = 1, \ldots, r\}. \]

On the other hand, for each $\delta > 0$ one can choose $\delta_0 < \delta$ such that for any $\delta' < \delta_0$

\[ G((p_1, \ldots, p_m); \delta) \supset \{(q_1, \ldots, q_m) \in \nabla_m^\circ : |q_i - p_i| < \delta', \ i = 1, \ldots, r\}. \]

Thus the result now follows from Case IV for $(P_1(\theta), \ldots, P_r(\theta))$.
The lemma now follows from (2.12) and the fact that $\nabla_m$ is compact.

□

For any $n \ge 1$, set

\[ L_n = \left\{ (p_1, \ldots, p_n, 0, 0, \ldots) \in \nabla : \sum_{i=1}^{n} p_i = 1 \right\} \]

and

\[ L = \bigcup_{i=1}^{\infty} L_i. \]

Now we are ready to state and prove the main result of this section.

Theorem 2.5 The family $\{PD(\theta) : \theta > 0\}$ satisfies a large deviation principle with speed $\lambda(\theta)$ and
rate function

\[ S(p) = \begin{cases} 0, & p \in L_1 \\ n - 1, & p \in L_n, \ p_n > 0, \ n \ge 2 \\ \infty, & p \notin L. \end{cases} \tag{2.18} \]


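Proof First note that the topology of the space $\nabla$ can be generated by the following metric

\[ d(p, q) = \sum_{k=1}^{\infty} \frac{|p_k - q_k|}{2^k}, \]

where $p = (p_1, p_2, \ldots)$, $q = (q_1, q_2, \ldots)$. For any fixed $\delta > 0$, let $B(p, \delta)$ and $\bar{B}(p, \delta)$ denote the
respective open and closed balls centered at $p$ with radius $\delta$.

We start with the case that $p$ is not in $L$.

For any $k \ge 1$, $\delta' > 0$, set

\[ \bar{B}_{k,\delta'}(p) = \{(q_1, q_2, \ldots) \in \nabla : |q_i - p_i| \le \delta', \ i = 1, \ldots, k\}. \]

Choose $\delta > 0$ so that $2^k \delta < \delta'$. Then

\[ \bar{B}(p, \delta) \subset \bar{B}_{k,\delta'}(p), \]

and

\[ \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log PD(\theta)\{\bar{B}(p, \delta)\} \le \limsup_{\theta \to 0} \lambda(\theta) \log PD(\theta)\{\bar{B}_{k,\delta'}(p)\} \le \limsup_{\theta \to 0} \lambda(\theta) \log Q_{k,\theta}\{F((p_1, \ldots, p_k); \delta')\} \le -\inf\{S_k(q_1, \ldots, q_k) : (q_1, \ldots, q_k) \in F((p_1, \ldots, p_k); \delta')\}. \tag{2.19} \]

Letting $\delta'$ go to zero, and then $k$ go to infinity, we get

\[ \lim_{\delta \to 0} \liminf_{\theta \to 0} \lambda(\theta) \log PD(\theta)\{B(p, \delta)\} = \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log PD(\theta)\{\bar{B}(p, \delta)\} = -\infty. \tag{2.20} \]

Next consider the case of $p$ belonging to $L$. Without loss of generality, we assume that $p$ belongs
to $L_n$ with $p_n > 0$.

For any $\delta > 0$, let

\[ G(p; \delta) = \{q \in \nabla : |q_k - p_k| < \delta, \ k = 1, \ldots, n\}, \]
\[ F(p; \delta) = \{q \in \nabla : |q_k - p_k| \le \delta, \ k = 1, \ldots, n\}. \]

Clearly, $\bar{B}(p, \delta)$ is a subset of $F(p; 2^n \delta)$. Since $\sum_{i=1}^{n} p_i = 1$, it follows that, for any $\delta > 0$, one
can find $\delta' < \delta$ such that

\[ B(p, \delta) \supset G(p; \delta'). \]

Using results on $(P_1(\theta), \ldots, P_n(\theta))$ in Case V in the proof of Lemma 2.4, we get

\[ \lim_{\delta \to 0} \liminf_{\theta \to 0} \lambda(\theta) \log PD(\theta)(B(p, \delta)) = \lim_{\delta \to 0} \limsup_{\theta \to 0} \lambda(\theta) \log PD(\theta)(\bar{B}(p, \delta)) = -(n-1). \tag{2.21} \]

Finally, the theorem follows from the compactness of $\nabla$.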

□ 

Remarks. 1. Consider the rate function $S(\cdot)$ as an "energy" function; then the energy needed to
get $n \ge 2$ different alleles is $n - 1$. The values of $S(\cdot)$ form a "ladder of energy". The energy needed
to get an infinite number of alleles is infinity, and thus it is impossible to have infinitely many alleles
under large deviation.

2. The effective domain of $S(\cdot)$, defined as $\{p \in \nabla : S(p) < \infty\}$, is clearly $L$. This is in sharp
contrast to the result in [2], where the rate function associated with large mutation rate has an
effective domain of $\{p \in \nabla : \sum_{i=1}^{\infty} p_i < 1\}$. The two effective domains are disjoint. One is part of
the boundary of $\nabla$ and the other is the interior of $\nabla$, and neither intersects the set

\[ \left\{ p \in \nabla : p_1 > p_2 > \cdots > 0, \ \sum_{i=1}^{\infty} p_i = 1 \right\}. \]

3. The large deviation result with small mutation rate is limited to $PD(\theta)$. Both the Dirichlet
process and the two-parameter Poisson-Dirichlet distribution converge to random limits when the
mutation rate becomes small. Thus it is not clear how to formulate large deviation problems for them
when the mutation rate approaches zero.
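The "ladder of energy" in Theorem 2.5 and Remark 1 can be made concrete with a small evaluator of the rate function (2.18). This is a sketch under stated assumptions: the function name `rate_S` and the tolerance `tol` are illustrative, and a finite tuple is read as a point of $\nabla$ padded with zeros, so points with infinitely many positive coordinates are outside its scope.

```python
import math

def rate_S(p, tol=1e-12):
    """S(p) from (2.18) for a finite tuple p, read as (p_1, ..., p_k, 0, 0, ...)."""
    decreasing = all(p[i] >= p[i + 1] - tol for i in range(len(p) - 1))
    if any(x < -tol for x in p) or not decreasing:
        raise ValueError("p must be nonnegative and nonincreasing")
    if abs(sum(p) - 1.0) > tol:
        return math.inf                 # total mass below one: p is not in L
    n = sum(1 for x in p if x > tol)    # number of alleles actually present
    return float(n - 1)                 # p lies in L_n with p_n > 0

print(rate_S((1.0,)), rate_S((0.5, 0.3, 0.2)), rate_S((0.6, 0.3)))
```

Each additional allele with positive frequency costs exactly one unit of energy, while any point that leaves mass unaccounted for has infinite cost.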



3 Applications 

In this section we will discuss two applications of Theorem 2.5. The first one is concerned with the
large deviation principle for the homozygosity.

A random sample of size $r \ge 2$ is selected from a population whose allelic types have distribution
$PD(\theta)$. The probability that all samples are of the same type is called the $r$th order population
homozygosity and is given by

\[ H_r(P_1(\theta), \ldots) = \sum_{i=1}^{\infty} P_i^r(\theta). \tag{3.1} \]

It is clear that $H_r(P_1(\theta), \ldots)$ converges to one as $\theta$ approaches zero. Our next theorem describes
the large deviations of $H_r(P_1(\theta), \ldots)$ from one.

Theorem 3.1 For any integer $r \ge 2$, the family of the laws of $H_r(P_1(\theta), \ldots)$ satisfies a large
deviation principle on $E$ with speed $\lambda(\theta)$ and rate function

\[ J(p) = \begin{cases} 0, & p = 1 \\ n - 1, & p \in [n^{-(r-1)}, (n-1)^{-(r-1)}), \ n = 2, \ldots \\ \infty, & p = 0. \end{cases} \tag{3.2} \]

Thus in terms of large deviations, $H_r(P_1(\theta), \ldots)$ behaves the same as $P_1^{r-1}(\theta)$.

Proof For any integer $r > 1$, $H_r(p)$ is clearly continuous on $\nabla$. By Theorem 2.5 and the
contraction principle, the family of the laws of $H_r(P_1(\theta), \ldots)$ satisfies a large deviation principle
with speed $\lambda(\theta)$ and rate function

\[ \inf\{S(q) : q \in \nabla, \ H_r(q) = p\} = \inf\{S(q) : q \in L, \ H_r(q) = p\}. \]

For $p = 1$, it follows by choosing $q = (1, 0, \ldots)$ that $\inf\{S(q) : q \in \nabla, \ H_r(q) = p\} = 0$. For
$p = 0$, there does not exist $q$ in $L$ such that $H_r(q) = p$. Hence $\inf\{S(q) : q \in L, \ H_r(q) = p\} = \infty$.

For any $n \ge 2$, the minimum of $\sum_{i=1}^{n} q_i^r$ over $L_n$ is $n^{-(r-1)}$, which is achieved when all $q_i$'s are
equal. Hence for

\[ p \in [n^{-(r-1)}, (n-1)^{-(r-1)}), \]

we have

\[ \inf\{S(q) : q \in \nabla, \ H_r(q) = p\} = n - 1 = J(p). \]

□ 
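The rate function (3.2) and the extremal value $n^{-(r-1)}$ used in the proof can be evaluated directly. The following sketch is illustrative: the names `homozygosity` and `rate_J` are assumptions, and the search for $n$ is a plain loop rather than a closed form, to avoid rounding problems at the interval endpoints.

```python
import math

def homozygosity(q, r: int) -> float:
    """H_r(q) = sum_i q_i**r for a finite frequency vector q."""
    return sum(x ** r for x in q)

def rate_J(p: float, r: int) -> float:
    """J(p) from (3.2): n - 1 on [n^-(r-1), (n-1)^-(r-1)), J(1) = 0, J(0) = inf."""
    if not 0.0 <= p <= 1.0 or r < 2:
        raise ValueError("need 0 <= p <= 1 and r >= 2")
    if p == 0.0:
        return math.inf
    n = 1
    while n ** (-(r - 1)) > p:      # find the smallest n with n^-(r-1) <= p
        n += 1
    return float(n - 1)

# Equal frequencies on n = 4 alleles attain the minimum n^-(r-1) of H_r over L_n.
print(homozygosity([0.25] * 4, 3), 4 ** -2, rate_J(0.4, 2))
```

For example, with $r = 2$ a homozygosity of 0.4 lies in $[1/3, 1/2)$, so at least three alleles are required and the cost is $J(0.4) = 2$.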

For any $a(\theta) > 0$ and any nonzero constant $s$, the Poisson-Dirichlet distribution with symmetric
selection is a probability measure on $\nabla$ given by

\[ P_{a(\theta), sH_r, \theta}(dp) = \left( \int_{\nabla} e^{s a(\theta) H_r(q)}\, PD(\theta)(dq) \right)^{-1} e^{s a(\theta) H_r(p)}\, PD(\theta)(dp), \]

where $a(\theta)$ is the selection intensity, and $s > 0$ ($s < 0$) corresponds to underdominant (overdominant)
selection.

In our second application, Theorem 2.5 is used to derive the large deviation principle for the
Poisson-Dirichlet distribution with symmetric selection.

Theorem 3.2 The family $\{P_{a(\theta), sH_r, \theta} : \theta > 0\}$ satisfies a large deviation principle on $\nabla$ with speed
$\lambda(\theta)$ and rate function

\[ S'(p) = \begin{cases} S(p), & \lim_{\theta \to 0} a(\theta)\lambda(\theta) = 0 \\ S(p) + s(1 - H_r(p)), & a(\theta)\lambda(\theta) = 1, \ s > 0 \\ S(p) + |s| H_r(p) - \inf\left\{ \frac{|s|}{n^{r-1}} + n - 1 : n \ge 1 \right\}, & a(\theta)\lambda(\theta) = 1, \ s < 0. \end{cases} \tag{3.3} \]

Proof Theorem 2.5, combined with Varadhan's lemma and the Laplace method, implies that
the family $\{P_{a(\theta), sH_r, \theta} : \theta > 0\}$ satisfies a large deviation principle on $\nabla$ with speed $\lambda(\theta)$ and rate
function

\[ \sup\{sH_r(q) - S(q) : q \in \nabla\} - (sH_r(p) - S(p)). \]

The theorem then follows from the fact that

\[ \sup\{sH_r(q) - S(q) : q \in \nabla\} = \begin{cases} s, & s > 0 \\ -\inf\left\{ \frac{|s|}{n^{r-1}} + n - 1 : n \ge 1 \right\}, & s < 0. \end{cases} \]

□ 

Remark. It is clear from the above theorem that selection has an impact on large deviations
only when the selection intensity is comparable to $\lambda(\theta)^{-1}$. Assume that $a(\theta) = (\lambda(\theta))^{-1}$. When
$s > 0$, the homozygote has a selective advantage. Thus the small mutation rate limit is $(1, 0, \ldots)$ and large
deviations from it become more difficult compared to the neutral case. This is confirmed by
the fact that $S'(p)$ is bigger than $S(p)$. For $s < 0$, the heterozygote has a selective advantage. The
fact that $S'(\cdot)$ may reach zero at a point that is different from $(1, 0, \ldots)$ shows that several alleles
can coexist in the population when the overdominant selection intensity goes to infinity and the mutation
rate approaches zero. In some cases, such as $r = 2$, $s = -k(k+1)$, $k \ge 1$, $S'(\cdot)$ is zero at more than one
point.
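The closing claim of the remark can be checked by direct enumeration. On $L_n$ with equal frequencies one has $S = n - 1$ and $H_r = n^{-(r-1)}$, so in the case $s < 0$ the rate function (3.3) vanishes at the uniform point on $n$ alleles exactly when $n$ minimizes $|s|/n^{r-1} + n - 1$. The sketch below assumes this reading; the function name `coexisting_allele_counts` and the search bound are illustrative choices.

```python
def coexisting_allele_counts(abs_s, r, n_max=None):
    """Return (minimum, minimizers) of f(n) = |s| / n**(r-1) + n - 1 over n >= 1."""
    # f(n) >= n - 1 and f(1) = |s|, so no minimizer can exceed |s| + 1.
    if n_max is None:
        n_max = int(abs_s) + 2
    values = {n: abs_s / n ** (r - 1) + n - 1 for n in range(1, n_max + 1)}
    best = min(values.values())
    return best, [n for n, v in values.items() if abs(v - best) < 1e-12]

for k in (1, 2, 3):
    print(k, coexisting_allele_counts(k * (k + 1), 2))
```

For $r = 2$ and $s = -k(k+1)$ the minimum is attained at both $n = k$ and $n = k+1$, so $S'(\cdot)$ has two zeros, corresponding to $k$ and $k+1$ equally frequent alleles.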